
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

The above annotation data package is for human (Homo sapiens) data from the UCSC hg19
build, based on the knownGene track. Annotation packages for the genomes of other
organisms are available at “http://bioconductor.org/packages/3.5/data/annotation/”.

In the next step, we will use the “annotatePeak()” function from the ChIPseeker package [9]
to annotate the peaks by associating them with the nearest genes. This function also
provides the “tssRegion=” option, which lets us specify the window around the TSS, as
upstream and downstream distances, within which a peak is assigned to the promoter of
the gene.

annotated_peaks <- lapply(bedfiles,
                          annotatePeak,
                          TxDb=txdb,
                          tssRegion=c(-1000, 1000),
                          verbose=FALSE)

annotated_peaks

The above R code applies the “annotatePeak()” function to each of the peak signal files.
The promoter region was set to the range (-1000, 1000) around the TSS of the gene, that
is, from 1,000 bases upstream to 1,000 bases downstream. Figure 6.13 shows the annotation
summaries for the three samples. Each summary gives the number of annotated peaks at the
top, followed by the peak annotation frequencies by genomic feature (gene region). The
highest frequencies are in the promoter region, which in this case indicates
transcriptional activity of the genes associated with the peaks.
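To make the meaning of a window such as tssRegion=c(-1000, 1000) concrete, the following is a minimal base R sketch, not ChIPseeker's actual implementation. It assumes each peak is summarized by a single center position and that the nearest TSS and its strand are already known. The helper names signed_distance_to_tss() and in_promoter_window() and all coordinates are hypothetical and used only for illustration; the real package works with full peak ranges and more feature categories, so its results may differ in detail.

```r
# Toy illustration (base R only). All coordinates are made up.

# Signed distance from a peak center to a TSS:
# negative = upstream of the TSS, positive = downstream,
# with the sign flipped for genes on the minus strand.
signed_distance_to_tss <- function(peak_center, tss, strand) {
  d <- peak_center - tss
  ifelse(strand == "-", -d, d)
}

# TRUE when the signed distance lies inside the window (limits inclusive).
in_promoter_window <- function(distance, tss_region = c(-1000, 1000)) {
  distance >= tss_region[1] & distance <= tss_region[2]
}

peaks <- data.frame(
  peak_center = c(10500, 12500, 49200, 51800),
  tss         = c(10000, 10000, 50000, 50000),
  strand      = c("+", "+", "-", "-"),
  stringsAsFactors = FALSE
)

peaks$distance <- signed_distance_to_tss(peaks$peak_center, peaks$tss, peaks$strand)
peaks$promoter <- in_promoter_window(peaks$distance)
print(peaks)

# Share of peaks per category, in percent (the kind of frequency
# reported in an annotation summary).
feature <- ifelse(peaks$promoter, "Promoter", "Other")
feature_pct <- round(100 * prop.table(table(feature)), 1)
print(feature_pct)
```

Widening the window, for example to c(-3000, 3000), would classify more peaks as promoter peaks, so the promoter frequency in a summary depends directly on this choice.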

The ChIPseeker package provides several functions to visualize the annotated peaks. The
“plotAnnoBar()” function creates a bar chart of the peak distribution across the
different genomic regions (features).

plotAnnoBar(annotated_peaks)

Figure 6.14 shows a bar plot of the peak distribution across the different genomic
regions of the genes. Most peaks fall in the promoter regions. The picture may differ
when the ChIP-Seq experiment targets TFs or histone marks.
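The kind of chart described above can be approximated in base R to show what is being plotted: one stacked bar per sample, with segments giving the percentage of peaks in each genomic feature. This is only a sketch under stated assumptions. The sample names, the per-peak labels, and the reduced set of four feature categories are made up for illustration and do not come from the data in this chapter.

```r
# Toy per-peak feature labels for two hypothetical samples (made-up values).
toy_annotations <- list(
  sample_A = c("Promoter", "Promoter", "Intron", "Distal Intergenic"),
  sample_B = c("Promoter", "Exon", "Intron", "Intron", "Distal Intergenic")
)

feature_levels <- c("Promoter", "Exon", "Intron", "Distal Intergenic")

# One column per sample, one row per feature, values in percent.
pct_matrix <- sapply(toy_annotations, function(labels) {
  counts <- table(factor(labels, levels = feature_levels))
  100 * as.numeric(counts) / length(labels)
})
rownames(pct_matrix) <- feature_levels
print(pct_matrix)

# Horizontal stacked bars, one bar per sample.
barplot(pct_matrix, horiz = TRUE, las = 1,
        col = gray.colors(length(feature_levels)),
        xlab = "Percentage (%)",
        legend.text = feature_levels,
        args.legend = list(x = "topright", cex = 0.7))
```

Because every bar is scaled to 100%, samples with very different peak counts can be compared directly by their feature composition.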

Distribution of peaks relative to TSS:

The sites of TF binding and Pol II localization are found in the promoter regions of the

genes. Thus, distribution of peaks around TSS will give an idea about the activity of the

FIGURE 6.13  Annotation summaries for three ChIP-Seq samples.